|
In computer science, radix sort is a non-comparative integer sorting algorithm that sorts data with integer keys by grouping keys by the individual digits which share the same significant position and value. A positional notation is required, but because integers can represent strings of characters (e.g., names or dates) and specially formatted floating point numbers, radix sort is not limited to integers. Radix sort dates back as far as 1887 to the work of Herman Hollerith on tabulating machines.〔 and 〕 Most digital computers internally represent all of their data as electronic representations of binary numbers, so processing the digits of integer representations by groups of binary digit representations is most convenient. Two classifications of radix sorts are least significant digit (LSD) radix sorts and most significant digit (MSD) radix sorts. LSD radix sorts process the integer representations starting from the least digit and move towards the most significant digit. MSD radix sorts work the other way around. LSD radix sorts typically use the following sorting order: short keys come before longer keys, and keys of the same length are sorted lexicographically. This coincides with the normal order of integer representations, such as the sequence 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11. MSD radix sorts use lexicographic order, which is suitable for sorting strings, such as words, or fixed-length integer representations. A sequence such as "b, c, d, e, f, g, h, i, j, ba" would be lexicographically sorted as "b, ba, c, d, e, f, g, h, i, j". If lexicographic ordering is used to sort variable-length integer representations, then the representations of the numbers from 1 to 10 would be output as 1, 10, 2, 3, 4, 5, 6, 7, 8, 9, as if the shorter keys were left-justified and padded on the right with blank characters to make the shorter keys as long as the longest key for the purpose of determining sorted order. ==Efficiency== The topic of the efficiency of radix sort compared to other sorting algorithms is somewhat tricky and subject to quite a lot of misunderstandings. Whether radix sort is equally efficient, less efficient or more efficient than the best comparison-based algorithms depends on the details of the assumptions made. Radix sort complexity is for keys which are integers of word size . Sometimes is presented as a constant, which would make radix sort better (for sufficiently large ) than the best comparison-based sorting algorithms, which all perform comparisons to sort keys. However, in general cannot be considered a constant: if all keys are distinct, then has to be at least for a random-access machine to be able to store them in memory, which gives at best a time complexity . That would seem to make radix sort at most equally efficient as the best comparison-based sorts (and worse if keys are much longer than ). The counter argument is the comparison-based algorithms are measured in number of comparisons, not actual time complexity. Under some assumptions the comparisons will be constant time on average, under others they will not. Comparisons of randomly-generated keys takes constant time on average, as keys differ on the very first bit in half the cases, and differ on the second bit in half of the remaining half, and so on, resulting in an average of two bits that need to be compared. In a sorting algorithm the first comparisons made satisfies the randomness condition, but as the sort progresses the keys compared are clearly not randomly chosen anymore. For example, consider a bottom-up merge sort. The first pass will compare pairs of random keys, but the last pass will compare keys that are very close in the sorting order. The deciding factor is how the keys are distributed. The best case for radix sort is that they are taken as consecutive bit patterns. This will make the keys as short as they can be, still assuming they are distinct. This makes radix sort , but the comparison based sorts will not be as efficient, as the comparisons will not be constant time under this assumption. If we instead assume that the keys are bit patterns of length for a constant and base-two logarithm, and that they are uniformly random, then radix sort will still be , but so will the comparison based sorts, as the "extra" length makes even the keys that are consecutive in the sorted result differ enough that comparisons are constant time on average. If keys are longer than , but random, then radix sort will be inferior. There are many other assumptions that can be made as well, and most require careful study to make a correct comparison. 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「Radix sort」の詳細全文を読む スポンサード リンク
|